
[Model] add optimal triton fused moe configs for NemotronH MoE #27967

Merged
heheda12345 merged 3 commits into vllm-project:main from tomeras91:add-nemotronH-moe-configs on Nov 4, 2025

tomeras91 (Member) commented Nov 3, 2025

Add optimal Triton fused MoE configs for NemotronH MoE.

Added configs for:

  1. TP = 1, 2
  2. device = H100 HBM3 80GB, L40S
  3. dtype = BF16

The configs were generated by running /benchmarks/kernels/benchmark_moe.py.
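
For readers unfamiliar with tuned config files of this kind, here is a minimal Python sketch of how one might be named and consulted at run time. It rests on assumptions that this PR does not state: that each file follows an `E=<experts>,N=<shard size>,device_name=<GPU>.json` naming pattern, that it maps a batch size (token count) to Triton kernel launch parameters, and that the entry with the nearest batch-size key is the one used. The helper names `config_file_name` and `pick_config`, the expert count, the shard size, and every parameter value below are hypothetical placeholders, not values from the configs added here.

```python
import json
from typing import Dict, Optional


def config_file_name(num_experts: int, shard_size: int, device_name: str,
                     dtype: Optional[str] = None) -> str:
    """Hypothetical helper: build a name like E=...,N=...,device_name=....json.

    Assumption: no dtype suffix is added for unquantized runs such as BF16.
    """
    dtype_part = f",dtype={dtype}" if dtype else ""
    return f"E={num_experts},N={shard_size},device_name={device_name}{dtype_part}.json"


def pick_config(configs: Dict[int, dict], num_tokens: int) -> dict:
    """Return the entry whose batch-size key is closest to num_tokens."""
    nearest = min(configs, key=lambda m: abs(m - num_tokens))
    return configs[nearest]


# Made-up file contents for illustration only (not taken from this PR).
example_json = """
{
  "1":    {"BLOCK_SIZE_M": 16,  "BLOCK_SIZE_N": 64,  "BLOCK_SIZE_K": 128,
           "GROUP_SIZE_M": 1,  "num_warps": 4, "num_stages": 3},
  "64":   {"BLOCK_SIZE_M": 64,  "BLOCK_SIZE_N": 128, "BLOCK_SIZE_K": 64,
           "GROUP_SIZE_M": 8,  "num_warps": 4, "num_stages": 3},
  "1024": {"BLOCK_SIZE_M": 128, "BLOCK_SIZE_N": 256, "BLOCK_SIZE_K": 64,
           "GROUP_SIZE_M": 16, "num_warps": 8, "num_stages": 4}
}
"""

# JSON object keys are strings, so convert them to integer batch sizes.
configs = {int(k): v for k, v in json.loads(example_json).items()}

# Placeholder expert count, shard size, and device string.
print(config_file_name(8, 1024, "NVIDIA_H100_80GB_HBM3"))
print(pick_config(configs, num_tokens=48))
```

The nearest-key lookup means a file only needs entries for a handful of representative batch sizes; anything in between falls back to the closest tuned entry.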

Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
mergify Bot added the performance (Performance-related issues) label Nov 3, 2025
gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request adds optimal Triton fused MoE configurations for the NemotronH MoE model, enhancing its performance on NVIDIA H100 and L40S GPUs. The changes include updating the benchmark script to recognize NemotronHForCausalLM and adding the corresponding generated JSON configuration files. The modifications are straightforward and appear correct. No issues were found.

heheda12345 changed the title from "[Model] app optimal triton fused moe configs for NemotronH MoE" to "[Model] add optimal triton fused moe configs for NemotronH MoE" on Nov 4, 2025
heheda12345 enabled auto-merge (squash) November 4, 2025 05:24
github-actions Bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Nov 4, 2025
heheda12345 merged commit e4ee658 into vllm-project:main Nov 4, 2025
51 checks passed
tomeras91 deleted the add-nemotronH-moe-configs branch November 4, 2025 13:25
ZhengHongming888 pushed a commit to ZhengHongming888/vllm that referenced this pull request Nov 8, 2025
…project#27967)

Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
…project#27967)

Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
